I keep saying I will remember how to do things and then I don’t, so I’m putting my bookmarked links and comments in one place to help me spend less time searching for answers.

1 ggplot

# load-packages
library(tidyverse)

1.2 hjust and vjust

Why do I always forget the direction of these?

hjust: 0 = left-aligned, 0.5 = center, 1 = right-aligned
vjust: 0 = bottom-aligned, 0.5 = middle, 1 = top-aligned

Visual Example - R-bloggers
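
A quick way to check the directions is to draw a small grid of labels, each justified by its own hjust and vjust value. This is only a sketch: the data frame just_df and its label column are made up for illustration, and expansion() assumes ggplot2 3.3.0 or newer.

```r
library(ggplot2)

# Hypothetical helper data: every combination of hjust and vjust in {0, 0.5, 1}
just_df <- expand.grid(hjust = c(0, 0.5, 1), vjust = c(0, 0.5, 1))
just_df$label <- paste0("h=", just_df$hjust, ", v=", just_df$vjust)

# The red point marks the anchor; each label is justified relative to it
p_just <- ggplot(just_df, aes(hjust, vjust)) +
  geom_point(color = "red") +
  geom_text(aes(label = label, hjust = hjust, vjust = vjust)) +
  # widen the panel so labels at the edges are not clipped
  scale_x_continuous(expand = expansion(mult = 0.3)) +
  scale_y_continuous(expand = expansion(mult = 0.3))

p_just
```

The position of each label relative to its red point shows which way a given value pushes the text.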

1.3 Math Expressions in labels

1.3.1 Use quote()

ggplot(mpg, aes(displ, hwy))+geom_point()+
  ggtitle(
    quote(
      alpha ^ 2 - frac(1, 10) + sum(n[i], i==1, N)
                )
    )

1.3.2 Use TeX() from the latex2exp package

  • must be in a string
  • must be denoted as math mode with dollar signs
  • must include 2 backslashes for \(\LaTeX\) commands
library(latex2exp)
## Warning: package 'latex2exp' was built under R version 4.0.3
ggplot(mpg, aes(displ, hwy))+geom_point()+
  ggtitle(TeX(
    "$\\alpha^2 - \\frac{1}{10} + \\sum_{i}^N n_i$"
                )
    )

1.5 Line up axes on stacked plots

Sometimes I’m working on two different types of plots (like a bar chart and a scatter plot) that happen to have the same x-axis. I want to line up these axes so that when the plots are stacked the values correspond to the same date.

1.5.1 Same geom type

# two different bar charts
A <- ggplot(mpg, aes(class))+geom_bar()+coord_flip()+ylim(0, 109)
B <- ggplot(mpg, aes(drv))+geom_bar()+coord_flip()+ylim(0, 109)

Using grid.arrange command from the gridExtra package does not line up axes.

#axes don't line up
gridExtra::grid.arrange(A, B, ncol=1)

Use grid.draw command from the grid package.
Source

#make plots into Grobs (grid graphical object)
gA <- ggplotGrob(A) 
gB <- ggplotGrob(B)
grid::grid.draw(rbind(gA, gB))

Another option is facet_wrap, which works great if the axes are the same for the different variables you want to compare, but it isn’t a perfect solution to everything (see below).

tidy.df <- pivot_longer(mpg, c(class, drv))
ggplot(tidy.df, aes(value))+geom_bar()+facet_wrap(~name)

1.5.2 Bar Chart and Scatter Plots

Scatter plots and bar charts will not line up automatically, even when using the grid.draw command detailed above. This is because their default limits are different given that the bar chart is centered on the value and the scatter plot is a single point on the value.

#work with smaller subset of data from economics, part of ggplot2 package 
startdate <- "2014-06-01"
economics_small <- economics %>%
  filter(date >= as.Date(startdate)) %>%
  arrange(date)
A <- ggplot(economics_small, aes(date, unemploy))+
  geom_bar(stat="identity")+
  geom_vline(xintercept = as.Date(startdate), color="red", size=2)

B <- ggplot(economics_small, aes(date, uempmed))+
  geom_point()+geom_line()+
  geom_vline(xintercept = as.Date(startdate), color="red", size=2)

gA <- ggplotGrob(A) 
gB <- ggplotGrob(B)
grid::grid.draw(rbind(gA, gB))
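
The mismatch can also be checked numerically. The sketch below rebuilds the two plots without the red line and uses layer_data() from ggplot2 to compare the x-extent each geom occupies. The names bar_plot, pt_plot, bars and pts are made up here, and dates show up as numbers (days since 1970-01-01).

```r
library(ggplot2)

# Same subset as above, built with base R so this block stands alone
economics_small <- subset(economics, date >= as.Date("2014-06-01"))

bar_plot <- ggplot(economics_small, aes(date, unemploy)) +
  geom_bar(stat = "identity")
pt_plot <- ggplot(economics_small, aes(date, uempmed)) +
  geom_point()

bars <- layer_data(bar_plot)  # includes xmin/xmax: the edges of each bar
pts  <- layer_data(pt_plot)   # only x: the position of each point

# Bars extend past the first and last dates; points do not
range(c(bars$xmin, bars$xmax))
range(pts$x)
```

The bar range is wider on both sides, which is why the default axes of the two plots do not match.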

There are a couple of options for lining them up.

1.5.2.1 Fix xlim for all charts

If you make the lower limit the first x-value, the first bar will not show up (remember it’s centered over the value, so half of it falls outside the limit).

A <- ggplot(economics_small, aes(date, unemploy))+
  geom_bar(stat="identity")+
  geom_vline(xintercept = as.Date(startdate), color="red", size=2)+
  xlim(as.Date(startdate), NA)

B <- ggplot(economics_small, aes(date, uempmed))+
  geom_point()+geom_line()+
  geom_vline(xintercept = as.Date(startdate), color="red", size=2)+
  xlim(as.Date(startdate), NA)

gA <- ggplotGrob(A) 
## Warning: Removed 1 rows containing missing values (geom_bar).
gB <- ggplotGrob(B)
grid::grid.draw(rbind(gA, gB))

This can be fixed by adding a half unit to the x-axis (i.e. having the lower limit be half-unit lower than smallest x-value). In this case the unit is a month, so a half-unit would be ~15 days.

HalfUnit <- .5*(economics_small$date[2] - economics_small$date[1])
HalfUnit
## Time difference of 15 days
A <- ggplot(economics_small, aes(date, unemploy))+
  geom_bar(stat="identity")+
  geom_vline(xintercept = as.Date(startdate), color="red", size=2)+
  xlim(as.Date(startdate)-HalfUnit, NA)

B <- ggplot(economics_small, aes(date, uempmed))+
  geom_point()+geom_line()+
  geom_vline(xintercept = as.Date(startdate), color="red", size=2)+
  xlim(as.Date(startdate)-HalfUnit, NA)

gA <- ggplotGrob(A) 
gB <- ggplotGrob(B)
grid::grid.draw(rbind(gA, gB))

1.5.3 Shift Bar chart to right

Bar charts are automatically centered over the x-value. Bar charts (and any other geom) can be shifted with position = position_nudge(). The shift needs to be half a unit on the x-axis; again, this is monthly data, so a half unit is ~15 days.
Source

A <- ggplot(economics_small, aes(date, unemploy))+
  geom_bar(stat="identity", position = position_nudge(x = as.vector(HalfUnit)))+
  geom_vline(xintercept = as.Date(startdate), color="red", size=2)

B <- ggplot(economics_small, aes(date, uempmed))+
  geom_point()+geom_line()+
  geom_vline(xintercept = as.Date(startdate), color="red", size=2)

gA <- ggplotGrob(A) 
gB <- ggplotGrob(B)

grid::grid.draw(rbind(gA, gB))

2 if else

The test expression goes in parentheses () and the statement goes in curly brackets {}

2.1 If

if (test) { statement }

2.2 If else

R is a bit finicky about where the brackets go; I get errors when I put else on a new line by itself. It wants the closing bracket right before it: } else

if (test) {
statement #1
} else {
statement #2
}

2.3 If else if else if … else

if (test1) {
statement #1
} else if (test2) {
statement #2
} else if (test3) {
statement #3
} else {
statement #4
}
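
A concrete version of the chain that can be pasted into the console. In R the keyword is written else if (two words), and each one takes its own test in parentheses. The variable x and the labels are made up for illustration.

```r
# x is a made-up value for illustration
x <- 7

if (x < 0) {
  size <- "negative"
} else if (x < 5) {
  size <- "small"
} else if (x < 10) {
  size <- "medium"
} else {
  size <- "large"
}

size
```

Only the first branch whose test is TRUE runs; the rest are skipped.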

3 Images

3.1 Markdown

![this is Daffodil and Blossom](Peegs.jpg) this is Daffodil and Blossom

Resizing images in Markdown is required if you are knitting to a PDF, because you can’t use HTML code there.

Tips and tricks for working with images and figures in R Markdown documents - hollie@zevross.com

Adjust the out.width and out.height in the R chunk options

{r, out.width="50%"}  
img <- "Peegs.jpg" #path to image  
knitr::include_graphics(img) #in the knitr package   

3.2 HTML in markdown

In my opinion, HTML is a lot easier to use for image options.

HTML Images

<img src="Peegs.jpg" alt="this is Daffodil and Blossom" width="50%">
this is Daffodil and Blossom

4 Correlation

4.1 base R correlation: cor()

cor(mtcars[,1:4])
##             mpg        cyl       disp         hp
## mpg   1.0000000 -0.8521620 -0.8475514 -0.7761684
## cyl  -0.8521620  1.0000000  0.9020329  0.8324475
## disp -0.8475514  0.9020329  1.0000000  0.7909486
## hp   -0.7761684  0.8324475  0.7909486  1.0000000

If the data has NAs in any of the variables, cor() will return NA for the correlations involving them. If you want to drop the NAs when calculating the correlation, do:

cor(..., use = "complete.obs")

Source: https://stackoverflow.com/questions/3798998/cor-shows-only-na-or-1-for-correlations-why
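
A small demonstration of the difference, assuming we knock one value out of mtcars ourselves (df_na and the cor_* names are made up for this sketch):

```r
# Copy of the first four mtcars columns with one missing value added
df_na <- mtcars[, 1:4]
df_na$mpg[1] <- NA

# Default: off-diagonal entries involving mpg come back NA
cor_default <- cor(df_na)

# Drop every row that has an NA in any column, then correlate
cor_complete <- cor(df_na, use = "complete.obs")

# Alternative: drop rows separately for each pair of variables
cor_pairwise <- cor(df_na, use = "pairwise.complete.obs")

cor_default
cor_complete
```

"complete.obs" uses the same rows for every pair, while "pairwise.complete.obs" keeps as many rows as it can for each pair, so the two can give slightly different numbers.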

4.2 Dot Plots for Multiple Variables: pairs()

The chart that plots every variable against every other as a dot plot, for checking whether the variables are correlated with each other.

#lots of variables so only look at first 4
testdf <- mtcars[,1:4]
pairs(testdf, main = "title")